Cross-population variation in usage of a call combination: evidence of signal usage flexibility in wild bonobos

The arbitrary relationship between signifier and signified is one of the features responsible for language’s extreme lability, adaptability, and expressiveness. Understanding this arbitrariness and its emergence is essential in any account of the evolution of language. To shed light on the phylogeny of the phenomenon, comparative data examining the relationship between signal form and function in the communication systems of non-humans are central. Here we report the results of a study on the production and usage of the whistle-high hoot call combination (W + HH) from two distant populations of wild bonobos (Pan paniscus): LuiKotale, DRC, and Kokolopori, DRC. We find that the context in which bonobos produce W + HHs varies systematically between populations. Our results suggest that variation in W + HH production may represent an example of signal-adjustment optionality, a key component of arbitrariness.
Supplementary Information: The online version contains supplementary material available at 10.1007/s10071-024-01884-4.


Introduction
The relationship between a signal's form and its function is foundational to all systems of communication. The nature of the association profoundly influences the expressive potential of a communication system (Hockett 1960). In language, the relationship between a word's sound and its meaning is said to be 'arbitrary' because in
The extent to which these examples of sound change, or 'signal-adjustment optionality' (sensu Watson et al. 2022), are mirrored by a similar capacity for 'semantic change', or 'signal-usage optionality' (sensu Watson et al. 2022), is largely unknown. Both pig-tailed macaques and Japanese macaques can be trained by humans to use 'coo' calls to request particular items, demonstrating the ability to use an existing signal in service of a novel function (Coudé et al. 2011; Hihara et al. 2003). These examples suggest that the association between a signal and its function is not immutable. However, the extent to which this potential for 'signal-usage optionality' occurs in the natural communication of nonhumans is largely unknown.
We address this question by comparing usage of a complex signal, the whistle-high hoot call combination (W + HH), produced in two populations of wild bonobos separated by 455 km (the Kokolopori and LuiKotale field sites). Previous work has demonstrated that the call combination is an antiphonal signal used during interparty communication (Schamberg et al. 2016, 2017). In the current study we present data on the contexts in which W + HHs are produced in order to investigate potential shifts in call usage across populations.

Study sites and subjects
Data for this study were collected at two field sites: LuiKotale (Hohmann and Fruth 2003) and Kokolopori (Surbeck et al. 2017). Home ranges for both communities were located in dense rainforest consisting of large patches of both terra firma and swamp forest. The two field sites are separated by 455 km. Several impassable rivers render any migration or contact between the bonobos at the two field sites impossible (see Fig. 1). Details of each field site are given below.

LuiKotale field site
For 13 months between July 2011 and March 2014, Isaac Schamberg (IS) sampled behavior and recorded vocalizations from 18 adults (7 males and 11 females, aged 10+ years) from a single group of bonobos (the Bompusa community) at the LuiKotale field site in the Mai-Ndombe province of the Democratic Republic of Congo (DRC). Individuals in this community have been studied continuously since 2002 and were fully habituated and identified at the beginning of the study.

Kokolopori field site
From January to May 2018, IS sampled behavior and recorded vocalizations from two groups of bonobos (the Ekalakala and Kokoalongo communities) at the Kokolopori field site in Equateur, DRC. The Ekalakala community consisted of 3 adult males and 6 adult females. The Kokoalongo community consisted of 10 adult males and 17 adult females. Bonobos in these communities have been studied continuously since 2016 and were fully habituated from the beginning of the data collection period. Given the small number of individuals in the Ekalakala community and the similar patterns observed across the two Kokolopori communities, we combined data from the Ekalakala and Kokoalongo communities into a single Kokolopori dataset (but see SI for data from each individual community).

Data collection and processing
Subjects were followed on foot and observations were conducted between 0600 and 1800. IS recorded vocalizations and accompanying behavior with an audio recorder (Marantz PMD 660; Sennheiser directional microphone) and later transcribed the recordings. Data were collected over the course of 1515 observation hours (1224 h at LuiKotale and 291 h at Kokolopori).
When an individual produced a vocalization, the observer noted the call type, the caller's behavior, the identity of individuals within 10 m of the caller, any immediate behavioral change after the call, and all vocalizations produced by the caller and by other individuals that preceded or followed the call. Initial classifications of call types in the field were later confirmed through visual inspection of spectrograms using the criteria laid out in published accounts of the bonobo vocal repertoire (de Waal 1988; Bermejo and Omedes 1999). We also evaluated the reliability of our categorization of call types by conducting a test of inter-rater reliability, in which a second coder, who was naïve to the hypotheses of the current study, classified 171 call units from 25 recordings from our dataset (representing 10% of the total dataset). The two coders agreed on 95% (161/171) of categorizations of 'whistles' and 'high hoots' (Cohen's kappa = 0.91), indicating a high degree of reliability in call categorization.
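For readers unfamiliar with the agreement statistic reported above, the following is a minimal sketch of how Cohen's kappa can be computed from two coders' labels. The function name and the example labels are hypothetical and are not taken from the study's data; the sketch only illustrates the standard definition, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: share of units given the same label by both coders.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the coders' marginal label proportions,
    # summed over categories.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Both coders used a single identical category; agreement is trivial.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels for illustration only (not the study's recordings).
coder_1 = ["whistle", "high hoot", "whistle", "high hoot", "other", "whistle"]
coder_2 = ["whistle", "high hoot", "whistle", "other", "other", "whistle"]
print(round(cohens_kappa(coder_1, coder_2), 3))
```

Because kappa corrects raw agreement for chance, it is generally lower than the raw percentage of matching labels unless agreement is perfect.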
We classified each call combination in the dataset into one of four contexts, which were defined as follows:
Travel: calls produced while the caller was walking terrestrially were classified in the 'travel' context. If an individual paused his/her travel for less than one minute, the utterance was still considered to have been produced in a travel context.
Arrival: calls produced two minutes before or after arriving at a feeding patch were classified in the 'arrival' context. Call production typically occurred near the base of a fruiting tree, or in a fruiting tree prior to feeding.
Feeding: calls produced while the caller was actively feeding were classified in the 'feeding' context. If an individual paused his/her feeding for less than one minute, the utterance was still considered to have been produced in a feeding context.
Rest: calls produced while callers were stationary (but not currently feeding) were classified in the 'rest' context.
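The context definitions above can be read as a small decision rule. The sketch below is one possible encoding, under stated assumptions: the argument names are hypothetical, and the order in which the rules are checked (arrival first) is our assumption, since the text does not state how overlapping cases were resolved.

```python
from typing import Optional


def classify_context(
    activity: str,
    last_activity: Optional[str] = None,
    pause_s: float = 0.0,
    arrival_offset_s: Optional[float] = None,
) -> str:
    """Assign a call to 'arrival', 'travel', 'feeding', or 'rest'.

    activity: 'walking', 'feeding', or 'stationary' at the time of the call.
    last_activity / pause_s: for a stationary caller, the activity that was
        interrupted and how long ago (seconds) it was interrupted.
    arrival_offset_s: seconds between the call and arrival at a feeding patch
        (negative = before arrival), or None if no arrival is associated.
    """
    # Assumption: the two-minute arrival window takes precedence.
    if arrival_offset_s is not None and abs(arrival_offset_s) <= 120:
        return "arrival"
    if activity == "walking":
        return "travel"
    if activity == "feeding":
        return "feeding"
    if activity == "stationary":
        # Pauses shorter than one minute keep the interrupted context.
        if pause_s < 60 and last_activity == "walking":
            return "travel"
        if pause_s < 60 and last_activity == "feeding":
            return "feeding"
        return "rest"
    raise ValueError(f"unknown activity: {activity!r}")


print(classify_context("stationary", last_activity="feeding", pause_s=30))
```

Encoding the one-minute pause rule explicitly makes clear that a briefly stationary caller is not automatically counted in the 'rest' context.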

Relationship between population and HH context
To investigate whether any between-population differences in W + HH production were specific to the call combination, or simply reflective of a more general pattern of behavioral or vocal change across communities, we also examined usage of HHs in the LuiKotale and Kokolopori populations. We classified the production context of each HH bout using the same criteria we used for classifying the W + HH combinations. We fitted the same generalized linear mixed model as for the W + HH analysis, but the response variable was the number of HHs (instead of W + HHs) produced by each individual in each context. If HH and W + HH usage co-vary across the two populations, such variation can be explained as part of a broader change in behavior. Conversely, if HH and W + HH usage do not co-vary across the two populations, variation in either one of the signals (HHs or W + HHs) would likely be evidence of a signal-specific difference between the populations.
For Kokolopori, all high-quality recordings of HHs produced by individuals who also produced at least one W + HH were included in our dataset. HHs from LuiKotale came from the same dataset as was used in Schamberg et al. (2016).

Whistle-high hoots
At the Kokolopori field site, we recorded 42 W + HH combinations from 15 individuals (8 males, 7 females). At the LuiKotale field site, we recorded 51 W + HH combinations from 14 individuals (7 males, 7 females). At both field sites, call order was invariant: 100% of W + HH combinations consisted of an initial whistle, followed by one or more high hoots (see Fig. 2).

Statistical analysis
Statistical analysis was conducted in R 3.6.1 GUI 1.70 El Capitan build. Although our sample size was somewhat constrained, we still opted for using GLMMs since our dataset exceeded minimum thresholds (N = 25) proposed in recent meta-analyses examining the use of GLMMs with limited datapoints (Jenkins and Quintana-Ascencio 2020).

Relationship between population and W + HH context
To investigate potential differences in the usage of W + HH call combinations between the LuiKotale and Kokolopori populations, we fitted a generalized linear mixed model with a Poisson error structure and log link function using the function 'glmer' in the R package 'lme4' (version 1.1-27.1). The response variable was the number of W + HH call combinations each individual produced in the four contexts. Each individual, therefore, is represented by four data points (one for each call context). We included the following predictor variables: population (LuiKotale/Kokolopori), context (arrival/feeding/rest/travel), the observation time of each individual, and the interaction between population and context. We included observation time as a predictor variable to control for variation in observation time between individuals. Individual identity was entered as a random effect.
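As a reading aid, the model described above can be written out as follows. This is a sketch in our own notation, not the authors': we assume dummy (treatment) coding of the predictors and a random intercept per individual, and the choice of reference levels is not specified in the text.

```latex
\begin{aligned}
y_{ic} &\sim \mathrm{Poisson}(\mu_{ic}), \\
\log \mu_{ic} &= \beta_0 + \beta_P P_i
  + \sum_{k=1}^{3} \beta_{C,k}\, C_{ck}
  + \sum_{k=1}^{3} \beta_{PC,k}\, P_i\, C_{ck}
  + \beta_T T_i + u_i, \\
u_i &\sim \mathcal{N}(0, \sigma_u^2).
\end{aligned}
```

Here y_{ic} is the number of W + HHs produced by individual i in context c, P_i is an indicator for population, C_{ck} are indicators for the three non-reference contexts, T_i is the observation time of individual i, and u_i is the random intercept for individual identity. Under this coding the population-by-context interaction contributes three coefficients, which matches the three degrees of freedom in the full-versus-null model comparison.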
Because W + HHs are produced somewhat rarely (Schamberg et al. 2016), we included all observed W + HHs in our dataset.

Discussion
Our results reveal a cross-population difference in bonobos' use of the whistle-high hoot (W + HH) call combination. Bonobos at the Kokolopori field site were significantly more likely to produce W + HHs upon arrival at a feeding tree, compared to bonobos at the LuiKotale field site. In contrast, we found no difference in the usage of high hoots (HHs) between the two populations. The contrasting results for HHs and W + HHs indicate that the shift in W + HH usage observed between LuiKotale and Kokolopori does not reflect a broader change in activity budgets or a general tendency to vocalize in particular contexts. Rather, they suggest that the difference in W + HH usage is potentially an example of signal-usage optionality, with individuals in the two populations using the W + HH combination for subtly different purposes. This variation in the context of call production (and the possible accompanying change in signal meaning) provides, to our knowledge, some of the first evidence of a capacity for signal-usage adjustment among wild primates.
At both sites, bonobos produced W + HHs in all four contexts, but the predominant context accompanying call production differed between the two populations (Fig. 3). At Kokolopori, the majority (22/42) of W + HHs were produced upon arrival at a fruiting tree. At LuiKotale, a plurality (20/52) of W + HHs were produced while resting. Comparison of the null model (which included 'population' and 'context' as predictor variables) and the full model (which included 'population', 'context', and the 'population*context' interaction) was significant (df = 3, χ² = 20.67, p < 0.001), indicating that there was a difference in the context of call production between the two populations. Results from Model 1 suggest that this difference may be driven by a high proportion of W + HHs produced in the arrival context by bonobos at Kokolopori (β = -2.7, SE = 0.8, z = -3.4, p = 0.007) (see S1 for full results of Model 1).
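The p-values attached to the full-versus-null comparisons follow from the upper tail of a chi-square distribution with 3 degrees of freedom. The sketch below is an illustration rather than the authors' code (the analysis itself was run in R); it uses the closed-form tail probability for df = 3 so that only the Python standard library is needed. The function name is hypothetical; the test statistics are the values reported for the W + HH and HH model comparisons.

```python
import math


def chi2_sf_df3(x: float) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable with 3 df.

    Closed form for df = 3: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2).
    """
    if x < 0:
        raise ValueError("chi-square statistic must be non-negative")
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


# Likelihood-ratio statistics reported in the text (df = 3 in both cases).
p_w_hh = chi2_sf_df3(20.67)  # W + HH comparison; falls below 0.001
p_hh = chi2_sf_df3(4.311)    # HH comparison; approximately 0.23

print(f"W + HH: p = {p_w_hh:.5f}")
print(f"HH:     p = {p_hh:.3f}")
```

The same statistic-to-p-value conversion applies to both comparisons because both add the same three interaction terms to the null model.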

High hoots
At both sites, bonobos produced a clear majority of high hoots during periods of feeding or resting (75/95 at LuiKotale and 37/51 at Kokolopori). Comparison of the null model (which included 'population' and 'context' as predictor variables) and the full model (which included 'population', 'context', and the 'population*context' interaction) was not significant (df = 3, χ² = 4.311, p = 0.230), indicating no difference in the production of HHs between the two populations.
Fig. 3 The proportion of whistle-high hoot (W + HH) call combinations produced in each of four contexts (arrival, feed, rest, and travel) among two populations of bonobos
Within the existing literature on flexibility in primate call usage, there are numerous examples of individuals modifying their vocal output as a function of context and/or social knowledge. For example, individuals may decide whether or not to call based on the identity of nearby individuals (e.g., Townsend et al. 2008; Kalan and Boesch 2015; Soldati et al. 2022), their relationship with a receiver (Silk et al. 2016), the direction of a receiver's gaze (Schel et al. 2013), and even the knowledge state of receivers (Crockford et al. 2012). These examples demonstrate strategic, volitional call usage by individual callers.
Our results extend these findings by not only providing further evidence for volitional call production, but also demonstrating a group-wide pattern of convergent call usage. Such group-conforming call usage appears similar to social traditions that have been identified in other behavioral domains (e.g., Perry et al. 2003; Whiten et al. 2001).
To our knowledge, our findings provide the first preliminary evidence for a vocal tradition based on the manner in which a particular signal is used by a group of primates (but see Wich et al. 2012 for a vocal tradition based on call selection; and Crockford et al. 2004; Ford 1991; Yurk et al. 2002; and Weilgart and Whitehead 1997 for vocal traditions based on convergent acoustics). The capacity for subtly shifting the precise contexts of production of a vocalization may represent an important stepping stone towards the fully-fledged arbitrariness of human language. Comparing vocal usage across great ape communities and populations is essential to shed light on how widespread this phenomenon is and, as a consequence, how deeply rooted it may be within the primate lineage.
As a result of the current study's specific scope, our results raise more questions than they answer. We would like to highlight two perspectives that we hope future research will address. First, we documented a difference in W + HH usage between LuiKotale and Kokolopori, but our data do not allow for inferences about a possible concomitant difference in signal meaning across the two populations. Our results are equally consistent with two potential interpretations: (1) LuiKotale and Kokolopori bonobos produce W + HHs in different contexts because the meaning of the signal differs between the two populations; or (2) LuiKotale and Kokolopori bonobos produce W + HHs in different contexts, but the meaning of the signal is identical in both populations. A playback experiment probing receiver behavior in the two populations would help disentangle these competing hypotheses regarding signal meaning.
Second, there is good evidence that while primates are largely born with a fixed vocal repertoire, they must learn how to correctly use those hardwired call types (e.g., Wegdell et al. 2019). If learning really does play an important role in shaping primate call usage, between-group differences in call usage should be common. To our knowledge, such differences have rarely been examined, and would be a fruitful direction for future research.

Fig. 1 The locations of the Kokolopori and LuiKotale field sites on a map of the Democratic Republic of Congo